73 research outputs found

    A Taxonomy of Privacy-Preserving Record Linkage Techniques

    The process of identifying which records in two or more databases correspond to the same entity is an important aspect of data quality activities such as data pre-processing and data integration. Known as record linkage, data matching or entity resolution, this process has attracted interest from researchers in fields such as databases and data warehousing, data mining, information systems, and machine learning. Record linkage poses several challenges, including scalability to large databases, accurate matching and classification, and privacy and confidentiality. The last of these arises because personal identifying data, such as names, addresses and dates of birth of individuals, are commonly used in the linkage process. When databases are linked across organizations, protecting the privacy and confidentiality of such sensitive information is crucial to the successful application of record linkage. In this paper we present an overview of techniques that allow databases to be linked between organizations while preserving the privacy of the data. Various such techniques, known as 'privacy-preserving record linkage' (PPRL), have been developed. We present a taxonomy that characterizes PPRL techniques along 15 dimensions, and conduct a survey of these techniques. We then highlight shortcomings of current techniques and discuss avenues for future research.

    An Efficient Two-Party Protocol for Approximate Matching in Private Record Linkage

    The task of linking multiple databases in order to identify records that refer to the same entity arises in a growing number of application areas. If unique identifiers for the entities are not available in all the databases to be linked, techniques that calculate approximate similarities between records must be used to identify matching pairs of records. Often, the records to be linked contain personal information such as names and addresses. In many applications, the exchange of attribute values that contain such personal details between organisations is not allowed due to privacy concerns. Linking records between databases without revealing the actual attribute values in these records is the research problem known as 'privacy-preserving record linkage' (PPRL). While various approaches have been proposed to deal with privacy within the record linkage process, a solution that is viable under real-world conditions must scale to very large databases while preserving security and linkage quality. We propose a novel two-party protocol for PPRL that addresses scalability, security and quality/accuracy. The protocol is based on (1) the use of reference values that are available to both database owners, which allows them to individually calculate the similarities between their attribute values and the reference values; and (2) the binning of these calculated similarity values to allow their secure exchange between the two database owners. Experiments on a real-world database with nearly two million records show that the protocol scales linearly to large databases and achieves high linkage accuracy, allowing for approximate matching in the privacy-preserving context. Since the protocol has a low computational burden and allows high-quality approximate matching while still preserving the privacy of the databases that are matched, it can be useful for many real-world applications requiring PPRL.
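
    The two ingredients named in the abstract above, similarities to shared reference values and binning of those similarities, can be illustrated with a minimal sketch. This is not the authors' protocol: the similarity function (`difflib.SequenceMatcher`), the bin count, the `max_bin_gap` comparison rule and all function names are assumptions made for illustration, and the sketch makes no security claims.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Approximate string similarity in [0, 1] (stand-in for the paper's measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def bin_index(sim: float, num_bins: int) -> int:
    """Map a similarity in [0, 1] to a bin number in [0, num_bins - 1]."""
    return min(int(sim * num_bins), num_bins - 1)


def encode(value: str, reference_values: list[str], num_bins: int = 10) -> tuple[int, ...]:
    """Encode a private value as its binned similarities to the shared references."""
    return tuple(bin_index(similarity(value, ref), num_bins) for ref in reference_values)


def candidate_matches(codes_a: dict, codes_b: dict, max_bin_gap: int = 1) -> list[tuple]:
    """Pair records whose bin codes differ by at most max_bin_gap in every position.

    Hypothetical comparison rule; the nested loop is quadratic and only for illustration.
    """
    matches = []
    for id_a, code_a in codes_a.items():
        for id_b, code_b in codes_b.items():
            if all(abs(x - y) <= max_bin_gap for x, y in zip(code_a, code_b)):
                matches.append((id_a, id_b))
    return matches


# Reference values known to both database owners (made-up examples).
references = ["miller", "johnson", "garcia"]

# Each owner encodes its own values locally; only the bin codes are exchanged.
owner_a = {"a1": encode("smith", references), "a2": encode("jonson", references)}
owner_b = {"b1": encode("smith", references), "b2": encode("garzia", references)}

pairs = candidate_matches(owner_a, owner_b)
```

    Only the tuples of bin numbers leave each owner's system, so the raw attribute values are never exchanged directly. How much such codes leak, and how the bins should be chosen, are questions this sketch does not address.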

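    Children's Voice and School Improvement: The Role of Technology in the Inclusive School of the Future (Glas djece i poboljšanje škole: Uloga tehnologije u inkluzivnoj školi budućnosti)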

    The purpose of the study is to highlight the importance of inclusive education. The study focuses on the need for students' voices to be heard and considered during educational planning. In doing so, it aims to detect which students are marginalized and to use this method for school improvement. The research was conducted in two kindergartens. In the first, twenty-two students and two teachers (a kindergarten teacher and the headmistress) were interviewed, while in the second we used non-participant observation of five children attending the integration class, and of their teachers, over one month. Interviews were also conducted in the second kindergarten, but only with two teachers (a qualified pedagogue and the headmistress). During the study, marginalized students were identified through the difficulties they face, and we managed to strengthen the children's confidence and help create a more empathetic school environment. This was achieved through a school project in which students narrate their personal stories using digital storytelling software. Our aim was also to encourage the creation of stories, which were produced with the Scratch digital storytelling software.

    On The Accuracy and Completeness of The Record Matching Process

    Record matching or linking is one of the phases of the data quality improvement process, in which records from different sources are cleansed and integrated in a centralized data store to be used for various purposes. Both earlier and recent studies in data quality and record linkage focus on statistical models that make strong assumptions about the probabilities of attribute errors. In this study, we evaluate different models for record linkage that are built from data only. We use a program that generates data with known error distributions, and we train classification models, which we use to estimate the accuracy and the completeness of the record linking process. The results indicate that automated learning techniques are adequate for this process and that both their accuracy and their completeness are comparable to those of other, mostly manual, processes.

    Student Admission Data Analytics for Open and Distance Education in Greece

    Over the last few decades, distance learning has become very popular because of its many advantages, notably its flexibility. The need for a better understanding of the data originating from such educational environments has led to the rise of the Educational Data Mining research field. However, most studies so far focus on the analysis of data collected during and/or after distance learning courses. In this paper, we study the demographic data related to student applications for admission to distance learning programs offered by the Hellenic Open University during the decade from 2003 to 2013. Our study aims at analyzing the data and discovering patterns and knowledge that can help the strategic positioning of the university and improve the experience it offers to its students. Moreover, we attempt to correlate the findings with the social and financial status of the applicants' environment.

    Parallel ELLPACK 3-D Problem Solving Environment
